
Initial raytraced lighting progress (bevy_solari) #19058


Status: Open. Wants to merge 74 commits into base: main.

Conversation

@JMS55 JMS55 commented May 4, 2025

Bevy Solari

Preface

  • See release notes.
  • Please talk to me in #rendering-dev on discord or open a github discussion if you have questions about the long term plan, and keep discussion in this PR limited to the contents of the PR :)

Connections

This PR

After nearly two years, I've revived the raytraced lighting effort I first started in #10000.

Unlike that PR, which included realtime techniques, I've limited this PR to:

  • RaytracingScenePlugin - BLAS and TLAS building, geometry and texture binding, sampling functions.
  • PathtracingPlugin - A non-realtime path tracer intended to serve as a testbed and reference.

What's implemented?

image

  • BLAS building on mesh load
  • Emissive lights
  • Directional lights with soft shadows
  • Diffuse (lambert, not Bevy's diffuse BRDF) and emissive materials
  • A reference path tracer with:
    • Antialiasing
    • Direct light sampling (next event estimation) with 0/1 MIS weights
    • Importance-sampled BRDF bounces
    • Russian roulette
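To illustrate the Russian roulette step listed above: after each bounce, a path tracer probabilistically terminates paths with low throughput, and compensates surviving paths so the estimator stays unbiased. This is a hypothetical Rust sketch for readers, not the PR's WGSL; `rand_f` here is a stand-in xorshift RNG.

```rust
// Stand-in for the shader's rand_f: xorshift32 mapped to [0, 1).
fn rand_f(state: &mut u32) -> f32 {
    *state ^= *state << 13;
    *state ^= *state >> 17;
    *state ^= *state << 5;
    (*state >> 8) as f32 / (1u32 << 24) as f32
}

/// Returns false if the path should terminate; otherwise scales `throughput`
/// by 1/p so the Monte Carlo expectation is unchanged.
fn russian_roulette(throughput: &mut [f32; 3], rng: &mut u32) -> bool {
    // Survival probability from the brightest throughput channel.
    let p = throughput[0].max(throughput[1]).max(throughput[2]).clamp(0.0, 1.0);
    if rand_f(rng) > p {
        return false; // terminate: this path contributes little
    }
    // Compensate surviving paths to keep the estimator unbiased.
    for c in throughput.iter_mut() {
        *c /= p;
    }
    true
}
```

Paths with full throughput always survive unchanged; dim paths are killed with probability 1 - p.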

What's not implemented?

  • Anything realtime, including a real-time denoiser
  • Integration with Bevy's rasterized gbuffer
  • Specular materials
  • Non-opaque geometry
  • Any sort of CPU or GPU optimizations
    • BLAS compaction, proper bindless, and further RT APIs are things that we need wgpu to add
  • PointLights, SpotLights, or skyboxes / environment lighting
  • Support for materials other than StandardMaterial (and only a subset of properties are supported)
  • Skinned/morphed or otherwise animating/deformed meshes
  • Mipmaps
  • Adaptive self-intersection ray bias
  • A good way for developers to detect whether the user's GPU supports RT, and to fall back to baked lighting if not
  • Documentation and actual finalized APIs (literally everything is subject to change)

End-user Usage

  • Have a GPU that supports RT with inline ray queries
  • Add SolariPlugin to your app
  • Ensure any Mesh asset you want to use for raytracing has enable_raytracing: true (defaults to true), and that it uses the standard uncompressed position/normal/uv_0/tangent vertex attribute set, triangle list topology, and 32-bit indices.
    • If you don't want to build a BLAS and use the mesh for RT, set enable_raytracing to false.
  • Add the RaytracingMesh3d component to your entity (separate from Mesh3d or MeshletMesh3d).

Testing

  • Did you test these changes? If so, how?
    • Ran the solari example.
  • Are there any parts that need more testing?
    • Other test scenes probably. Normal mapping would be good to test.
  • How can other people (reviewers) test your changes? Is there anything specific they need to know?
    • See the solari.rs example for how to set up raytracing.
  • If relevant, what platforms did you test these changes on, and are there any important ones you can't test?
    • Windows 11, NVIDIA RTX 3080.

@atlv24 atlv24 left a comment


Very cool, good job

fn sample_cosine_hemisphere(normal: vec3<f32>, rng: ptr<function, u32>) -> vec3<f32> {
let cos_theta = 1.0 - 2.0 * rand_f(rng);
let phi = 2.0 * PI * rand_f(rng);
let sin_theta = sqrt(max(1.0 - cos_theta * cos_theta, 0.0));
Contributor:

I assume the max accounts for the precision problems the reference describes.

Contributor Author:

Must be. I wrote this code a few years ago; honestly it's just copied from an older version of Solari, so I no longer remember.

Comment on lines +97 to +98
world_normal: vec3<f32>,
geometric_world_normal: vec3<f32>,
Contributor:

Docs here on what the distinction is would be good. I'm guessing world_normal is the normal as dictated by the mesh, with vertex interpolation, and geometric_world_normal is the normal of the triangle.

Contributor Author:

Yeah basically.

I'm not going to spend time documenting it, because this is all subject to change. I haven't decided exactly what the APIs are going to look like yet: some passes probably don't even care about normals or materials; for shadow rays we don't have BVH intersections and have to use the vertex and index buffers, whereas for other rays we can get the positions directly from the BVH; etc.

I plan to figure things out over time and write the APIs as I need them.


let local_normal = mat3x3(vertices[0].normal, vertices[1].normal, vertices[2].normal) * barycentrics; // TODO: Use barycentric lerp, ray_hit.object_to_world, cross product geo normal
var world_normal = normalize(mat3x3(transform[0].xyz, transform[1].xyz, transform[2].xyz) * local_normal);
let geometric_world_normal = world_normal;
Contributor:

Is this assuming "flat" normals? Should it be taking the cross product of the edges instead?

Contributor Author:

Not sure, I wrote this code a while ago. But I think I had tested on bistro at some point and it looked fine.

Contributor:

Can you leave a note or something at least? This looks wrong: why make a distinction between world normal and geometric normal if they're always the same anyway?

Contributor Author:

Eventually I'll be using the geometric normal for certain things, I think? I'm not really sure what the plan for handling normals is. I need to work more on the realtime part, then measure perf and see how using normal maps or not, and using vertex normals vs geometric normals from the RT intersection, affect perf.

Most of the code in that function is temporary while I figure out what I need long term.


// Setup RNG
let pixel_index = global_id.x + global_id.y * u32(view.viewport.z);
let frame_index = u32(old_color.a) * 5782582u;
Contributor:

This doesn't look like it's a frame index. Plus, isn't old_color.a in [0, 1]? Casting it to u32 would then almost always give zero, so this wouldn't contribute anything. Edit: I see old_color.a is the frame count; perhaps move the multiplication down into the RNG initialization instead, so that the name makes sense.
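For readers following this exchange, the seeding scheme under discussion can be mirrored in Rust roughly like this (the real code is WGSL; names here are hypothetical):

```rust
// Hypothetical Rust mirror of the shader's RNG setup: each pixel gets a
// seed from its linear index, decorrelated across frames by mixing in the
// accumulated frame count times a large odd constant.
fn rng_seed(pixel_x: u32, pixel_y: u32, viewport_width: u32, frame_count: u32) -> u32 {
    let pixel_index = pixel_x + pixel_y * viewport_width;
    // As the reviewer notes, this only decorrelates frames if frame_count
    // really is the accumulated frame count, not an alpha value in [0, 1]
    // truncated to zero.
    let frame_index = frame_count.wrapping_mul(5782582);
    pixel_index.wrapping_add(frame_index)
}
```

With frame_count = 0 the seed is just the pixel index; each subsequent frame shifts every pixel's seed by the same large odd constant.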

@janhohenheim janhohenheim added the S-Needs-Review Needs reviewer attention (from anyone!) to move forward label May 17, 2025
@alice-i-cecile (Member):

@JMS55 now that we have one review, I'd like to start wrangling a second for you. Can you get CI green for us?

@JMS55 (Contributor Author) commented May 19, 2025

Everything is fixed now, just need to wait for bevyengine/naga_oil#119 to be merged and a new naga_oil release.

@ecoskey (Contributor) left a comment:

Some initial nits on the CPU code. Mostly pretty minor stuff, because it looks great overall. I'll dive into the shaders later, but the math seems well-reviewed already.

min_binding_size: None,
},
count: MAX_MESH_COUNT,
},
@ecoskey (Contributor) May 21, 2025:

Could add a wrapper struct for it plus an extension method, maybe. Having a few exceptions like this that require manually specifying the whole array is painful.

storage_buffer(true).with_count(MAX_MESH_COUNT)? 🤔

BindGroupLayoutEntry {
binding: 5,
visibility: ShaderStages::COMPUTE,
ty: BindingType::AccelerationStructure,
@ecoskey (Contributor) May 21, 2025:

same here, I'd rather extend our dsl where needed to cover cases like this. Even though it's stupid simple it's more code to maintain/review :p

Contributor Author:

I'd be happy to have something like that, but I couldn't figure out how to add the necessary extensions for the binding array sizes, unless I make an entirely new method.

}
}

type StorageBufferList<T> = StorageBuffer<Vec<T>>;
@ecoskey (Contributor) May 21, 2025:

is this type alias worth it? It doesn't save a ton of space, and at first glance I assumed this was a list of storage buffers.

Contributor Author:

It makes the types above easier to read imo. I could remove it if people dislike it, I don't feel super strongly either way.

Contributor:

Extraneous to this PR.

Contributor Author:

It was necessary, as with the old code change detection triggered for the camera all the time, which constantly reset the pathtracer accumulation.

Contributor:

Ah ok, nvm then 👍

));
}

fn add_raytracing_meshes_on_scene_load(
Contributor:

we should expose this as a one-shot system or command if this is necessary for all glTF scenes

Contributor Author:

Nah this is just for the demo. Long term we can figure out a good way to handle things, but this is just a prototype atm.

Contributor Author:

Because it's intended that you can also use lower-resolution proxy meshes for RT.


/// A mesh component used for raytracing.
///
/// The mesh used in this component must have [`bevy_render::mesh::Mesh::enable_raytracing`] set to true,
Contributor:

What's the failure case look like for these requirements? A warn/nice panic message is probably fine but if it's just a wgpu crash message or silent fail I think we need something better. I don't mind if that's in a followup tho

Contributor Author:

I think it silently skips it. BLAS will never get built, so scene binder will never pick it up.

Yeah a warning during BLAS build probably makes sense, I can add that in the future.

/// The material used for this entity must be [`MeshMaterial3d<StandardMaterial>`].
#[derive(Component, Clone, Debug, Default, Deref, DerefMut, Reflect, PartialEq, Eq, From)]
#[reflect(Component, Default, Clone, PartialEq)]
#[require(MeshMaterial3d<StandardMaterial>, Transform, SyncToRenderWorld)]
Contributor:

is StandardMaterial a requirement because the standard gbuffer isn't integrated yet?

@ecoskey (Contributor) May 21, 2025:

Probably fine either way, but if so maybe make a note since this seems like something that could slip notice in the future:

Suggested change
#[require(MeshMaterial3d<StandardMaterial>, Transform, SyncToRenderWorld)]
// FIXME: remove standard material requirement once standard gbuffer is supported
#[require(MeshMaterial3d<StandardMaterial>, Transform, SyncToRenderWorld)]

Contributor Author:

No, StandardMaterial is always going to be required, at least until we have RT pipeline support in wgpu. Otherwise we have no way of knowing what color things are when tracing indirect rays off screen.

I was thinking maybe of making a simplified RT material that's separate from StandardMaterial so that users could still use custom materials for the primary view with raster, but idk yet, we can figure that out later.

label: Some("build_blas_command_encoder"),
});
command_encoder.build_acceleration_structures(&build_entries, &[]);
render_queue.submit([command_encoder.finish()]);
Contributor:

Could this be made into its own top-level render graph node, or does RT require an early submit call like this?

Contributor Author:

It could be a render graph node, but I don't see much point in it.

In the future hopefully wgpu gets multi-queue and we can use a separate queue to build and compact BLAS's in parallel with rendering.

weak_handle!("9a69dbce-d718-4d10-841b-4025b28b525c");
const SAMPLING_SHADER_HANDLE: Handle<Shader> = weak_handle!("a3dd800b-9eaa-4e2b-8576-80b14b521ae9");

pub struct RaytracingScenePlugin;
Contributor:

Perhaps it's because English is not my native language, but personally I think RayTracingScenePlugin is a better naming format than RaytracingScenePlugin. For example, the abbreviation for "Ray Traced Global Illumination" is "RTGI", not "RGI". Writing it as "RayTracing" feels more natural and comfortable to me.

@ecoskey (Contributor) left a comment:

Looks great! I dove into the math as much as I could, and I trust the other reviewers covered the rest. I'm very rasterizing-brained, so I wanted to clarify a few things to better review RT stuff in the future.

var ray_t_min = 0.0;

// Path trace
var radiance = vec3(0.0);
Contributor:

not a big deal but I think we should stick with using photometric units everywhere, just bc it's caused confusion before

Suggested change
var radiance = vec3(0.0);
var luminance = vec3(0.0);

Contributor Author:

I'm actually going to do the opposite, and use radiometric everywhere for solari, which matches papers better.

let ray_hit = trace_ray(ray_origin, ray_direction, RAY_T_MIN, light_distance - RAY_T_MIN - RAY_T_MIN, RAY_FLAG_TERMINATE_ON_FIRST_HIT);
let visibility = f32(ray_hit.kind == RAY_QUERY_INTERSECTION_NONE);

let radiance = triangle_data.material.emissive.rgb * visibility * cos_theta_origin * (cos_theta_light / light_distance_squared);
Contributor:

same here

let light_id = rand_range_u(light_count, rng);
let light_source = light_sources[light_id];

var radiance: vec3<f32>;
Contributor:

here as well

radiance += throughput * diffuse_brdf * sample_random_light(ray_hit.world_position, ray_hit.world_normal, &rng);

// Sample new ray direction from the material BRDF for next bounce
ray_direction = sample_cosine_hemisphere(ray_hit.world_normal, &rng);
Contributor:

is a cosine weight always used for this, or does it become more complicated to importance sample with more complicated BRDFs?

Contributor:

It gets more complicated for other BRDFs, but since this treats all surfaces as Lambertian, cosine-weighted sampling is optimal here.
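To spell out why cosine weighting is optimal for Lambertian surfaces: with pdf(ω) = cos θ / π, the cos θ in the rendering equation and the cos θ in the pdf cancel, leaving just the albedo. Below is a hypothetical Rust sketch, not the PR's WGSL; the hemisphere sampler uses one common construction (uniform sphere direction plus the normal, renormalized), which matches the truncated snippet above but is an assumption about its elided remainder.

```rust
use std::f32::consts::PI;

// Cosine-weighted hemisphere sample about `normal`, from two uniform
// random numbers u1, u2 in [0, 1). A uniform unit-sphere direction offset
// by the normal and renormalized is cosine-distributed about the normal.
fn sample_cosine_hemisphere(normal: [f32; 3], u1: f32, u2: f32) -> [f32; 3] {
    let cos_theta = 1.0 - 2.0 * u1;
    let phi = 2.0 * PI * u2;
    // max() guards against negative values from floating-point error.
    let sin_theta = (1.0 - cos_theta * cos_theta).max(0.0).sqrt();
    let sphere = [sin_theta * phi.cos(), sin_theta * phi.sin(), cos_theta];
    let v = [normal[0] + sphere[0], normal[1] + sphere[1], normal[2] + sphere[2]];
    let len = (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]).sqrt();
    [v[0] / len, v[1] / len, v[2] / len]
}

// Monte Carlo weight for a Lambertian BRDF (albedo / pi) under the
// cosine-weighted pdf (cos_theta / pi): everything cancels except albedo.
fn lambertian_weight(albedo: f32, cos_theta: f32) -> f32 {
    let brdf = albedo / PI;
    let pdf = cos_theta / PI;
    brdf * cos_theta / pdf // == albedo
}
```

For more complicated BRDFs the pdf no longer matches the integrand exactly, and the weight retains a non-trivial ratio; that is where multiple importance sampling starts to pay off.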


let radiance = triangle_data.material.emissive.rgb * visibility * cos_theta_origin * (cos_theta_light / light_distance_squared);

let inverse_pdf = f32(triangle_count) * triangle_data.triangle_area;
Contributor:

Does the pdf here represent the relative contribution of this triangle compared to the whole mesh? Or is it more like the chance that the pathtracer would have sampled this in the first place? Asking because I wonder if (cos_theta_light / light_distance_squared) would fit better on this line so the term for projected triangle area is all in one place.

Contributor:

The PDF in this case is the PDF of the specific sampling strategy used here, which selects from the triangles with uniform probability, and then selects a point based on the selected triangle's area.

When (if?) MIS is implemented in the future, it will require this PDF on its own, not combined with cos_theta_light / light_distance_squared.

Contributor Author:

Yes, the PDF is basically the chance of choosing that light+triangle+point on the triangle+etc.

The cos_theta_light / light_distance_squared term is the "geometry term" used for converting from area measurement to solid angle measurement.
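The exchange above can be written out as arithmetic. This is an illustrative Rust sketch (not the PR's code): sampling a point uniformly on one of N emissive triangles has area-measure pdf 1 / (N * triangle_area), and the geometry term cos_theta_light / distance^2 converts that area measure to a solid-angle measure at the shading point.

```rust
// inverse_pdf = 1 / pdf_area = N * area; multiplied by the geometry term,
// it gives the full Monte Carlo weight for a uniformly chosen triangle.
fn light_sample_weight(
    triangle_count: u32,
    triangle_area: f32,
    cos_theta_light: f32,
    light_distance: f32,
) -> f32 {
    let inverse_pdf = triangle_count as f32 * triangle_area;
    let geometry_term = cos_theta_light / (light_distance * light_distance);
    inverse_pdf * geometry_term
}
```

Keeping inverse_pdf and the geometry term separate, as the code does, is what lets a future MIS implementation reuse the pdf on its own.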

@alice-i-cecile alice-i-cecile added S-Needs-SME Decision or review from an SME is required and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels May 26, 2025
@Elabajaba (Contributor) left a comment:

Didn't review the WGSL code, but otherwise it seems good, apart from needing some logging to be able to track down why certain meshes/scenes don't work.

(blas, blas_size)
}

fn is_mesh_raytracing_compatible(mesh: &Mesh) -> bool {
Contributor:

Should we log an error in this function if the mesh isn't compatible? It might help track down why those scenes weren't working when I tested them.

Contributor Author:

Yeah this was brought up above, I can add it to the next PR.

use bevy_transform::components::GlobalTransform;
use core::{hash::Hash, num::NonZeroU32, ops::Deref};

const MAX_MESH_SLAB_COUNT: NonZeroU32 = NonZeroU32::new(500).unwrap();
Contributor:

Just to double check, this is the number of meshes per slab, and not the total number of meshes in the scene?

Contributor Author:

This is the number of slabs to bind. The code is fairly temporary though, hopefully wgpu gets proper bindless support.

@JMS55 (Contributor Author) commented May 28, 2025

I've changed this PR a little in the previous commit. The monolithic sample light function has been split up into 3 separate functions:

  1. Generate random light sample (pick light, pick point on light)
  2. Calculate unshadowed light contribution (given light sample and point, calculate radiance from light at point, don't trace rays to test visibility)
  3. Test light visibility (trace ray from given point to given light sample)

These APIs are what I've settled on based on prototyping the realtime/ReSTIR shaders.

I've also modified GpuDirectionalLight a little to move some of the calculations for solid angle / PDF from the GPU to the CPU.
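A rough Rust-flavored sketch of the three-stage split described above (all names here are invented for illustration; the actual API is in WGSL and, as noted, subject to change):

```rust
// Hypothetical mirror of the three separated light-sampling stages.
struct LightSample {
    light_id: u32,
    point: [f32; 3], // point selection on the light is elided here
}

/// 1. Generate a random light sample: pick a light (and a point on it).
///    No shading math, no rays traced.
fn generate_light_sample(light_count: u32, u: f32) -> LightSample {
    let light_id = ((u * light_count as f32) as u32).min(light_count - 1);
    LightSample { light_id, point: [0.0; 3] }
}

/// 2. Unshadowed contribution: radiance the sample would deliver at the
///    shading point if nothing occluded it (still no rays traced).
fn unshadowed_contribution(emitted: f32, cos_theta: f32, distance: f32) -> f32 {
    emitted * cos_theta / (distance * distance)
}

/// 3. Visibility: trace a shadow ray from the point to the sample.
///    Stubbed as always-visible here, since there is no BVH on the CPU.
fn light_visibility(_from: [f32; 3], _sample: &LightSample) -> f32 {
    1.0
}
```

Separating stages 2 and 3 is what ReSTIR-style resampling needs: candidates can be generated and weighted by their unshadowed contribution cheaply, with the expensive shadow ray traced only for the winner.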

Labels
A-Rendering Drawing game state to the screen C-Feature A new feature, making something new possible D-Complex Quite challenging from either a design or technical perspective. Ask for help! M-Needs-Release-Note Work that should be called out in the blog due to impact S-Needs-SME Decision or review from an SME is required X-Controversial There is active debate or serious implications around merging this PR
8 participants